An earlier study was extended and replicated to examine the
feasibility of generating multiple-choice test questions by
transforming sentences from prose instructional material. In the
first study, a computer-based algorithm was used to analyze prose
subject matter and to identify high-information words. Sentences
containing selected words were then transformed into multiple-choice
items by four writers who generated foils or question alternatives
informally and by an algorithmic method. These items were then
organized into tests and administered to 24 college students before
and after they had studied the instructional materials. In this
replication, the tests were administered to 249 high school students,
and results were combined with those obtained earlier. This provided
stable estimates of item difficulty. Results supported those obtained
earlier. Thus, it appears that this item-writing technique is
feasible, and that algorithmic methods of generating foils produce
items of reasonably good quality. (The prose passage used in the
study and examples of test items are appended.) (Author/CTM)
The purpose of this effort was to extend or replicate an earlier study that
examined the feasibility of generating multiple-choice test questions by transforming
sentences from prose instructional materials. In that study, a computer-based
algorithm was used to analyze prose subject matter and to identify high-information
words. Sentences containing selected words were then transformed into multiple-choice
items by four writers, who generated foils or question alternatives informally
and by an algorithmic method. These items were then organized into tests and
administered to 24 college students before and after they had studied the instructional
materials.

In this replication, the tests were administered to 249 high school students, and
results were combined with those obtained earlier. This provided stable estimates of
item difficulty. Results supported those obtained earlier. Thus, it appears that this
item-writing technique is feasible and that algorithmic methods of generating foils
produce items of reasonably good quality.
FOREWORD

This research and development was conducted under the sponsorship of the Defense
Advanced Research Projects Agency and is related to studies of criterion-referenced
testing being conducted at this Center. Information resulting from this testing will be
incorporated in a testing manual being prepared by the Navy Personnel Research and
Development Center. This manual will be used operationally by the Chief of Naval
Education and Training, the Chief of Naval Technical Training, and the Chief of Naval
Education and Training Support (specifically, the Instructional Program Development
Centers).

A previous report, NPRDC TR 78-23 of June 1978, described the beginning phases of
a contractual effort aimed at examining the qualities of test questions written from a
variety of methods. This report describes a replication and extension of that work.
Results will be considered in further development of algorithmic procedures for
generating test questions from prose materials.

Appreciation is expressed to Dr. John R. Bormuth of the University of Chicago, and
Dr. Jason Millman of Cornell University, who were consultants for this project.

Dr. Pat-Anthony Federico of this Center served as the Contracting Officer's Technical
Representative.
SUMMARY

Problem and Background

Methods for writing test questions or items, particularly for criterion-referenced
testing, are needed that are (1) based on a logically defined relationship between the
instructional materials and the test items written to assess learning from those materials,
and (2) capable of producing items that can be easily replicated by many test developers.
Such methods should allow tests to become more scientific instruments and contribute to
the advancement of instructional research, educational evaluation, and the use of test
data in forming public policy.

In an earlier study (NPRDC TR 78-23), an attempt was made to refine a method of
objectively generating multiple-choice test questions by transforming sentences from
prose instructional materials and developing foils or question alternatives by an
algorithmic method. In that study, selected instructional material was computer-analyzed
to identify high information words (those that are relatively rare in American English)
and to determine the text frequency of those words. Twenty high information nouns
and adjectives (10 rare singletons and 10 keywords) were selected for use as question
words. Singletons are high information words that occur only once in a passage;
keywords are those that occur more than once. Twenty sentences were then selected for
transformation into items by four item writers. Five of these sentences included rare
singleton nouns; five, rare singleton adjectives; five, keyword nouns; and five, keyword
adjectives.

The four item writers transformed the selected sentences by substituting the question
words with wh-words (who, what, etc.), and generated item foils or response alternatives
both informally and with an algorithmic method. This resulted in 160 items (20 selected
sentences transformed by four item writers using two foil methods) that were organized
into eight 20-item test forms. These test forms were administered to subjects, three
to each form, before (pretest) and after (posttest) they studied the instructional
material. Care was taken to ensure that students completed different test forms on the
two test occasions. Average pretest and posttest item difficulties, as determined by the
percentage of subjects who answered the question correctly, were computed for items (1)
produced by each of the four writers, (2) derived from each of the four types of question
words, and (3) with foils generated by each of the two methods.

Results indicated that rare singleton nouns and adjectives and keyword adjectives are
promising candidates for use as question words in developing questions that test learning
from prose. Keyword nouns, however, are not good candidates. It was concluded that the
methods used to generate foils algorithmically were feasible. Although foils produced by
these methods were somewhat easier than those generated by item writers, they still
appeared to produce a significant shift in difficulty from pretest to posttest when
instruction was provided between testing sessions.

Purpose

The purpose of this study was to extend or replicate the earlier study. It is expected
that the results will form the basis for additional development of algorithmic procedures
for generating test questions from prose materials.

Approach

The eight forms were administered to 249 high school students before and after they
had studied the instructional material. For both pre- and posttest, about 30 students were
randomly assigned to each of the test forms. Care was taken, however, to ensure that the
forms administered to each subject on the two test occasions were different.

To obtain stable estimates of item difficulty, test results from the earlier study were
combined with those obtained in this study. Thus, the total number of subjects was 273
(24 college students and 249 high school students). A repeated measures analysis of
variance was used to examine differences in item difficulties between (1) the four item
writers, (2) the two parts of speech of question words, (3) the two types of text
frequencies (keyword and rare singletons), (4) the two foil types, and (5) the two test
occasions.

Results

1. Items based on rare singleton nouns and adjectives and keyword adjectives
showed a significant change in item difficulty from pretest to posttest, indicating that
such items are useful in learning from the type of prose used in the study.

2. Items derived from keyword nouns produced low quality items, primarily because
the sentences they occurred in were usually introductory sentences of a general nature.

3. The two types of foils proved to be almost equally effective for learning, as
evidenced by the similarity in posttest item difficulty. Those generated by item writers,
however, were considerably harder on the pretest and showed a higher change in item
difficulty from pretest to posttest than did those generated algorithmically.

4. No significant differences between item writers were found, indicating that the
sentence transformation methods employed apparently neutralized the effects of item
writer bias that has been found in other studies of item writing.



Conclusions

The concept of using a computer-based algorithm to analyze prose instructional
materials and to identify high information words appears to be workable. High
information rare singleton nouns or adjectives, as well as keyword adjectives that occur
no more than three times, appear to be good candidates for question words. Keyword
nouns, however, apparently are not good candidates, particularly when they occur in
general introductory sentences.

Recommendations

1. Rare singleton nouns and adjectives and keyword adjectives that occur
infrequently in instructional material should be used to select sentences from prose passages
for transformation into questions that measure reading comprehension. Keyword nouns
should not be used, particularly when they occur in general introductory sentences.

2. Methods of algorithmically generating foils for multiple-choice versions of
sentence-derived questions should be further refined and applied in a variety of subject
matter areas.
INTRODUCTION

Problem

Methods for writing test questions or items, particularly for criterion-referenced
testing, are needed that are (1) based on a logically defined relationship between the
instructional materials and the test items written to assess learning from those materials,
(2) defined by a set of operations open to public inspection, and (3) capable of producing
items that can be easily replicated by many test developers. Such methods should allow
tests to become more scientific instruments and contribute to the advancement of
instructional research, educational evaluation, and the use of test data in forming public
policy.

Background

Roid and Finn (1978) attempted to refine a method of objectively generating
multiple-choice test questions by transforming sentences from prose instructional
materials and developing foils or question alternatives by an algorithmic method. A prose
passage on insect development (see appendix), which was written for approximately the
high school level, was selected for use in the Roid and Finn study. Items (stems and foils)
to test learning from this passage were developed using the following procedure:
1. The selected material was computer-analyzed to identify high information
words (those that are relatively rare in American English) and to determine the text
frequency of those words. Twenty high information nouns and adjectives (10 rare
singletons and 10 keywords) were selected for use as question words. Singletons are high
information words that occur only once in a passage; and keywords, words that occur more
than once.

2. Twenty sentences were then selected for transformation into multiple-choice
items by four item writers. Five of these sentences included rare singleton nouns; five,
rare singleton adjectives; five, keyword nouns; and five, keyword adjectives.

3. The stems for these multiple-choice items were produced by substituting the
question words with wh-words (who, what, etc.). For example, the rare singleton
"silverfish" appeared in the following sentence: "The most primitive insects, such as the
silverfish, do not go through metamorphosis." For this sentence, one writer produced the
following item stem: "The most primitive insects, such as what, do not go through
metamorphosis?" Next, for each of the 20 item stems produced, each writer produced
two sets of foils or alternatives. One set was produced informally by the writer; and the
other, by an algorithmic method. For example, for the above item stem, the writer/author
produced the following foils:

a. Informally: Butterflies, Silverfish, Canine, and Cicadas.

b. Algorithmically: Silverfish, Females, Individuals, and Wasps.

This process resulted in 160 multiple-choice items: 20 selected sentences
transformed by four item writers using two foil methods. For a given instance, the stems, as
well as the foils produced informally by the writers, were comparable but not identical.
The foils produced algorithmically, however, were the same across items/writers.
Examples are provided in the appendix.
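
The singleton/keyword distinction in step 1 and the wh-word substitution in step 3 can be illustrated with a minimal Python sketch. It is not the program used by Roid and Finn: the function names, the tokenizing rule, and the choice to drop a leading article before the wh-word are assumptions made for illustration, and the sketch covers text frequency only, not the rarity of a word in American English.

```python
import re
from collections import Counter


def classify_by_text_frequency(passage, candidates):
    """Label each candidate 'singleton' (occurs once) or 'keyword' (occurs more than once).

    Candidates that do not occur in the passage are left out of the result.
    """
    # Assumed tokenizing rule: lowercase words, with internal hyphens kept.
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", passage.lower())
    counts = Counter(tokens)
    labels = {}
    for word in candidates:
        n = counts[word.lower()]
        if n == 1:
            labels[word] = "singleton"
        elif n > 1:
            labels[word] = "keyword"
    return labels


def make_stem(sentence, question_word, wh_word="what"):
    """Replace the question word (and an article directly before it) with a wh-word."""
    pattern = re.compile(
        r"\b(?:(?:the|a|an)\s+)?" + re.escape(question_word) + r"\b",
        re.IGNORECASE,
    )
    stem = pattern.sub(wh_word, sentence, count=1)
    # Turn the declarative sentence into a question.
    return stem.rstrip(".") + "?"


SENTENCE = ("The most primitive insects, such as the silverfish, "
            "do not go through metamorphosis.")
print(make_stem(SENTENCE, "silverfish"))
```

For the example sentence quoted in step 3, the sketch yields the same stem as the one written by the item writer.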



For the rare singleton and keyword nouns, those selected as question
words were classified semantically using the method devised by Frederiksen.
For example, "silverfish" was classified as a concrete, processive, animate noun
(see Figure 1). Other rare singleton nouns in the passage that also met this
classification were then selected at random to create foils. Those selected as foils
for "silverfish" using this method were "females," "individuals," and "wasps," as
indicated above. Using this ...

Figure 1. Frederiksen's semantic classification of nouns.
To generate foils for the adjective question words, all rare singleton and keyword
adjectives in the prose passage (not just those selected as question words) were classified
using semantic differential techniques (Nunnally, 1967, pp. 536-538). In research using
these techniques, adjectives are typically classified based on their (1) evaluation (e.g.,
good or bad), (2) potency (e.g., strong or weak), (3) activity (e.g., fast or slow), and
(4) familiarity (e.g., simple or complex). In addition to these four categories, rare
singleton and keyword adjectives were classified according to whether or not they could
be regarded as "technical" words. This latter category is particularly useful for
certain material, particularly for grouping adjectives that relate to a ...

After the adjectives were classified according to these five categories, they were
analyzed as to their familiarity, using the Dale-Chall (1948) list of 3000 familiar words.
Some were eliminated because they were too familiar to be used as foils.
Approximately 50 adjectives passed this screen and qualified for use
as foils. From this group, foils were developed by randomly selecting those having the
same classification as the adjective question words (i.e., as to evaluation, potency, etc.).
For example, those selected for the rare singleton "pupal" were "nymphal," "parasitic,"
and "insect" (see appendix).
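
The random selection of foils from words that share the question word's classification can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `seed` parameter, and the class labels in the example dictionary are hypothetical, and the dictionary is not the classification actually produced in the study.

```python
import random


def select_foils(question_word, classified_words, n_foils=3, seed=None):
    """Randomly pick n_foils words that share the question word's class.

    classified_words maps each candidate word to a class label.
    The question word itself is never returned as a foil.
    """
    target_class = classified_words[question_word]
    pool = sorted(
        word for word, label in classified_words.items()
        if label == target_class and word != question_word
    )
    if len(pool) < n_foils:
        raise ValueError("not enough words in the matching class")
    # A private Random instance keeps the draw reproducible when a seed is given.
    return random.Random(seed).sample(pool, n_foils)


# Hypothetical classification, for illustration only.
classified = {
    "pupal": "technical",
    "nymphal": "technical",
    "parasitic": "technical",
    "insect": "technical",
    "aquatic": "technical",
    "distinctive": "evaluative",
}
foils = select_foils("pupal", classified, seed=0)
print(foils)
```

Because the draw depends only on the classification and the random seed, every item writer working from the same classification would receive the same foils, which is the property the algorithmic method relies on.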

From the 160 items, eight 20-item test forms were developed. Each test included
five items generated from rare singleton nouns; five, from keyword nouns; five, from rare
singleton adjectives; and five, from keyword adjectives. In addition, test forms were
organized so that each included five items from each of the four item writers, ten items
with foils generated informally by the item writers, and ten items with foils generated
algorithmically. The internal consistency reliability estimates (Kuder-Richardson
Reliability Formula Number 20) averaged .63 for these test forms.
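
For readers unfamiliar with Kuder-Richardson Formula 20, the following Python sketch shows how such an estimate is computed from scored responses. It is an illustration, not the computation performed in the study: the function name and the toy data are hypothetical, and the use of the population variance of total scores is an assumption (some texts use the sample variance).

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomously scored items.

    responses: one list per examinee, holding 1 (correct) or 0 (incorrect)
    for each item. Assumes the population variance of total scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(person) for person in responses]
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(person[j] for person in responses) / n_people
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Toy data: four examinees, three items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
```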

The eight forms were administered to 24 students from the Oregon College of
Education before (pretest) and after (posttest) they had studied the prose passage on
insect development. For both pretest and posttest, three subjects were randomly assigned
to each of the eight test forms; care was taken, however, to ensure that the pretest and
posttest forms administered to each student were different.

Average pretest and posttest item difficulties, as determined by the percentages of
students who answered the item correctly, were computed for items (1) produced by each
of the four writers, (2) derived from each of the four types of question words, and (3) with
foils either generated informally by the writers or algorithmically. Also, a nonparametric
analysis of variance (ANOVA) (Wilson, 1956) was used to examine differences in item
difficulties between (1) the four item writers, (2) the four question word types, (3) the two
foil types, and (4) the two test occasions.
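
Item difficulty as defined here is simply the percentage of examinees answering an item correctly, so a higher value means an easier item. A minimal Python sketch, with hypothetical function names and made-up responses for three subjects per form, is:

```python
def item_difficulty(responses):
    """Percentage of examinees who answered the item correctly (1 = correct, 0 = incorrect)."""
    return 100.0 * sum(responses) / len(responses)


def pre_post_shift(pretest, posttest):
    """Change in item difficulty from pretest to posttest, in percentage points."""
    return item_difficulty(posttest) - item_difficulty(pretest)


# Made-up responses from three subjects on each occasion.
pre = [1, 0, 0]
post = [1, 1, 0]
print(item_difficulty(pre), item_difficulty(post), pre_post_shift(pre, post))
```

A positive shift indicates that more examinees answered the item correctly after studying the passage.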

Results showed that items based on rare singleton nouns and adjectives and keyword
adjectives showed a significant change in item difficulty from pretest to posttest,
indicating that such items are useful in learning from the type of prose used in the study.
Items derived from keyword nouns, however, produced low quality items, primarily
because the sentences they occurred in were usually introductory sentences of a general
nature.

The two types of foils proved to be almost equally effective for learning, as
evidenced by the similarity in posttest item difficulty. Thus, Roid and Finn concluded
that the methods they used for generating foils were feasible. Although foils produced by
these methods were somewhat easier than those generated by item writers, they still
appeared to produce a significant shift in difficulty from pretest to posttest when
instruction was provided between testing sessions.

Finally, the results of the ANOVA showed a strong main effect for test occasions,
which indicates that all types of items were effective for learning. There was also a main
effect for word type, which was caused by the easier items derived from keyword nouns,
as noted above. In addition, there were two significant three-way interactions: (1) writers by
word type by pretest-posttest and (2) writers by foil types by pretest-posttest. The first
was caused by variations in item difficulties in items produced by the different writers;
and the second, by the fact that one writer generated better foils than the others.

Purpose

The purpose of this study was to extend or replicate the Roid and Finn study. It is
expected that the results will form the basis for additional development of algorithmic
procedures for generating test questions from prose materials.



APPROACH

Subjects

The eight forms developed in the Roid and Finn study were administered to 249 high
school students before (pretest) and after (posttest) they had studied the passage on insect
development. For both pretest and posttest, approximately 30 subjects were randomly
assigned to each of the eight test forms. Care was taken, however, to ensure that the
pretest and posttest forms administered to each subject were different.

Analysis

For purposes of analysis, test results from the earlier study were combined with those
obtained in this study. Thus, the total number of subjects was 273 (24 college students
and 249 high school students). Since the number of subjects responding to each test form
varied from 27 to 38 on the pretest and from 23 to 33 on the posttest, it was possible to
obtain quite stable estimates of item difficulties. A repeated-measures analysis of
variance (ANOVA) (4 x 2 x 2 x 2 x 2 factorial design) was used to examine differences in
item difficulties between (1) the four item writers, (2) the two parts of speech (adjectives
and nouns) of question words, (3) the two types of text frequencies (keyword and rare
singletons), (4) the two foil types (writer's choice and algorithmic), and (5) the two test
occasions (pretest and posttest).

With 160 items given on two occasions, the analysis had 320 data points, and five
replications per cell. The ANOVA, which was conducted on the item difficulties for items
in each cell of the design, is useful for determining the "instructional sensitivity" of
items. A significant main effect for the pretest-posttest factor would indicate that
pretest difficulties were significantly different from posttest difficulties for all items. A
significant interaction effect involving the pretest-posttest factor would indicate that
certain types of items differed in the pattern of their pretest and posttest difficulties.
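
The counts given above follow directly from the design, as the short Python sketch below shows. The factor names and level labels are hypothetical shorthand for the factors described in the text.

```python
from itertools import product

# Between-item factors of the design (labels are illustrative shorthand).
between_item_factors = {
    "writer": [1, 2, 3, 4],
    "foil_type": ["writer's choice", "algorithmic"],
    "part_of_speech": ["noun", "adjective"],
    "text_frequency": ["keyword", "rare singleton"],
}
occasions = ["pretest", "posttest"]  # the repeated measure

cells = list(product(*between_item_factors.values()))
n_items = 160

replications_per_cell = n_items // len(cells)  # items falling in each cell
data_points = n_items * len(occasions)         # one difficulty per item per occasion

print(len(cells), replications_per_cell, data_points)
```

The 4 x 2 x 2 x 2 between-item layout gives 32 cells, so 160 items supply five items per cell, and measuring each item on two occasions gives the 320 data points.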

RESULTS AND DISCUSSION

ANOVA Results

Table 1, which presents the results of the analysis of variance (ANOVA) of item
difficulties, shows that the strongest effect was the main effect for test occasions (R).
This finding indicates that, across all types of items, the percentage of subjects getting
pretest items correct was lower than the percentage of subjects getting posttest items
correct. In other words, most items showed instructional sensitivity. Table 2 shows that
pretest item difficulties averaged 47.6 percent across all items; and posttest item
difficulties, 74.4 percent. This indicates that the subjects did learn by reading the
passage, even though nearly half were able to guess the correct answer to most questions
on the pretest. With four-option multiple-choice items such as those used in this study,
excellent items should show pretest difficulties nearer to the level of random guessing
(25%).

Two important findings of this experiment were the main effect of part of speech (P)
and the interaction of P with the repeated measure (RP), as shown in Table 1. An
inspection of Table 3 (P and RP interaction effects) reveals that items based on noun
question words were significantly easier overall than were items based on adjectives
(65.6 vs. 56.3 percent). Also, the difference between pretest and posttest
difficulties was greater for nouns than for adjectives (29.5 vs. 24.4%) (untabled), which
Table 1

Repeated Measures Analysis of Variance on Item
Difficulties of Items of Each Type

Source                              df          F

W (Writers)                          3         .24
F (Foil Type)                        1         .15
P (Noun vs. Adjective)               1       12.99*
S (Keyword vs. Rare Singleton)       1
WF                                   3
WP                                   3
FP                                   1
WS                                   3
FS                                   1
PS                                   1       14.21*
WFP                                  3         .11
WFS                                  3         .34
WPS                                  3         .25
FPS                                  1         .13
WFPS                                 3         .94
Residual                           128

R (Pretest vs. Posttest)             1      472.03*
RW                                   3        1.90
RF                                   1        1.37
RP                                   1        4.76***
RS                                   1
RWF                                  3
RWP                                  3
RFP                                  1
RWS                                  3
RFS                                  1        9.25**
RPS                                  1       20.42*
RWFP                                 3        1.05
RWFS                                 3         .61
RWPS                                 3        2.63
RFPS                                 1         .11
RWFPS                                3         .57
Residual                           128

Note. One F value is illegible in each of two groups of rows. The legible values for
S, WF, WP, FP, WS, and FS are, in printed order, 2.22, 1.66, .47, 1.29, and 1.33; those
for RS, RWF, RWP, RFP, and RWS are 2.05, 2.54, 3.04, and .43.

*p < .001.
**p < .005.
***p < .05.
Table 2

Means and Standard Deviations of Item Difficulties
on Pretest and Posttest

                              Pretest            Posttest
Type of Item               Mean    S.D.       Mean    S.D.

Writers (W):
  1                         ...    18.3       71.6     ...
  2                         ...    19.2       72.7     ...
  3                         ...    20.9       75.7     ...
  4                         ...    19.6       77.5     ...
Foil (F):
  Writer's Choice           ...    20.1       74.6    19.
  Algorithmic               ...     ...       74.2    18.9
Part of Speech (P):
  Noun                     50.8    20.1       80.3    14.1
  Adjective                 ...    18.2       68.     21.5
Stem Type (S):
  Keyword                   ...     ...       75.     16.4
  Rare Singleton            ...     ...       73.4    21.5
Test Forms:
  1                         ...     ...       79.3    20.4
  2                         ...     ...       81.9    17.6
  3                         ...     ...       77.0    14.4
  4                         ...     ...       74.0    21.1
  5                         ...     ...       71.9    16.4
  6                         ...     ...       72.     13.4
  7                         ...     ...       67.9    21.2
  8                         ...     ...       70.9    21.2
All Items                  47.6    19.4       74.4    19.1

Note. Posttest S.D. values are legible for three of the four writers (21.4, 20.2, and
17.2), but their row assignment is not.



Table 3

Means and Standard Deviations of Item Difficulties
for Various Interaction Effects

                                        Repeated Measure (R)
                              Pretest          Posttest          Average
                            Mean    S.D.     Mean    S.D.     Mean    S.D.

P and RP Interaction Effects
  Noun                      50.8    20.1     80.3    14.1     65.6    14.7
  Adjective                  ...    18.2     68.     21.5     56.3    18.1
  Average                   47.6    19.4     74.4    19.1     61.0    17.1

PS and RPS Interaction Effects
  Noun-based Item:
    Keyword                 61.3     ...     83.5    12.8     72.4    12.3
    Rare Singleton          40.4    18.6     77.3    14.8     58.8    13.8
  Adjective-based Item:
    Keyword                 39.4    14.0     67.4    15.6     53.     13.
    Rare Singleton           ...    20.6     69.4    26.3     59.3    21.6
  Average                   47.6    19.4     74.4    19.1     61.0    17.1

RFS Interaction Effects
  Writer's Choice Foil:
    Keyword                 52.5    18.2     75.3     ...     63.9    16.3
    Rare Singleton          40.2    20.2     74.0    21.4     57.1    17.9
  Algorithmic Foil:
    Keyword                 48.2    18.6     75.6    15.      61.9    15.8
    Rare Singleton          49.3    19.0     72.7    21.9     61.0    18.1
  Average                   47.6    19.4     74.4    19.1     61.0    17.1

Note. See Table 1 for definitions.
indicates that noun-based items had greater instructional sensitivity than did
adjective-based items.

An examination of the PS and RPS interaction effects in Table 3 further reveals the
source of the difference between nouns and adjectives in this study. As shown, the
average difficulty of items based on keyword nouns is 72.4 percent compared to less than
60 percent for the other types of items. This is because keyword nouns typically occur in
introductory sentences that are very general and that address the main topics of the
entire passage. For example, in the passage on insect development, the keyword noun
"insects" appears in the very first sentence, which happens to be a very general
statement: "The life of most insects is short but active." Students can usually answer
questions derived from this type of sentence without having to read the prose passage.
Also, keyword noun items were relatively easy for subjects to recall on the posttest
(average item difficulty of 84%), possibly because they were mentioned several times in
the passage (see Table 4). This assumption supports Finn's (1977) hypothesis that the
information content of rare words is reduced by their high text frequency. Although the
fact that keyword adjectives produced the most difficult items (39.4%) appears to be
inconsistent with that hypothesis, Table 4 shows that the keyword adjectives occurred
fewer times than keyword nouns. Thus, Finn's hypothesis does apply, in that higher text
frequency was related to the easiness of items constructed from keywords. With text
frequencies of 2 or 3, the keyword adjectives were very close to being rare singletons.



Table 4

Question Words Selected from the Passage
and Their Text Frequency

            Nouns                                  Adjectives
Rare Singleton    Keyword               Rare Singleton    Keyword

Instars           Insect (8)            Plant-feeding     Immature (3)
Cicadas           Insects (20)          Pupal             Incomplete (2)
Silverfish        Metamorphosis (9)     Spine-like        Nymphal (2)
Wasps             Egg (8)               Self-made         Aquatic (2)
Appetites         Adult (8)             Worm-like         Distinctive (2)

Note. The number appearing in parentheses behind keywords represents text frequency.

The rare singleton nouns showed a good pattern of pretest and posttest difficulties.
They had the highest average instructional sensitivity (40.4 to 77.3 percent, a difference
of 36.9 percent). The rare singleton adjectives were somewhat easier on the pretest and
more difficult on the posttest than were the rare singleton nouns.

As shown in Table 1, there was no main effect for writers (W) or foil type (F), nor was
there a significant interaction between writers and foil (WF). This result is somewhat
surprising in that different writers would be expected to write easier or harder items
when they were allowed to choose their own foils.



Table 1 does show one interaction (RFS) involving foil type. The means and standard
deviations of item difficulties for that interaction are also included in Table 3. As shown,
all of the posttest means are very similar. A Newman-Keuls a posteriori test of the
differences between pretest item difficulties in this interaction, however, revealed that,
among the items with "writer's choice" foils, the rare singleton items were more difficult
on the pretest than were the keyword items (40.2 vs. 52.5%).



Variance Between Writers

The variability of item difficulties across item writers was examined to determine
whether the difficulties of items constructed with "writer's choice" foils varied more
across writers than did the difficulties of items constructed with algorithmic foils. It was
expected that some writers would choose very difficult foils for a given transformed
sentence; and others, easy foils. The algorithmic foils, which were chosen at random from
matched groups of similar words from the passage, should be free of any item-writer bias
and, hence, less variable in their effects on item difficulty.

In examining the variability across writers, the focus was on each sentence that was
transformed by each writer. As indicated previously, each of the four item writers
produced multiple-choice items (stem and foil) for each of the 20 sentences selected for
transformation. It was, therefore, possible to identify four item difficulties for a given
combination of sentence and foil technique. For example, for the sentence containing the
keyword adjective "immature," the four items generated using the "writer's choice" foil
method resulted in pretest difficulties of 38, 65, 52, and 37 percent respectively, and
posttest difficulties of 67, 63, 74, and 52 percent. The pretest and posttest variabilities
were then calculated across these item difficulties, as shown in Table 5.

After all variances of item difficulties across writers were calculated, they were
subjected to a repeated-measures ANOVA in which the dependent variables were the
natural logarithms of the variances (Scheffé, 1959, p. 83). The design for this analysis
was 2 x 2 x 2 x 2 with the following factors: (1) foil type (writer's choice vs. algorithmic),
(2) part of speech (noun vs. adjective), (3) stem type (keyword vs. rare singleton question
word), and (4) the repeated measure (pretest vs. posttest). Surprisingly, results showed
that there were no significant main effects or interactions. For example, even though the
average variability of the writer's-choice foil method was 115.31 percent compared to
73.97 percent for the algorithmic foil method, the difference was not statistically
significant.

One important limitation of the present study that should be mentioned is that only
four item writers were employed. Calculation of variabilities across only four writers is
clearly susceptible to the influence of any one of the four item difficulties. With a larger
sample of writers, the effects may have been more clearly detectable.
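
The quantity analyzed here can be illustrated with the pretest difficulties quoted above for the "immature" sentence. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and the use of the sample variance (dividing by n - 1) is an assumption, since the report does not state which variance formula was applied.

```python
import math
from statistics import variance


def log_variance_across_writers(difficulties):
    """Return the variance of item difficulties across writers and its natural log.

    Assumes the sample variance (n - 1 in the denominator).
    """
    var = variance(difficulties)
    return var, math.log(var)


# Pretest difficulties (percent) of the four "writer's choice" items
# written for the sentence containing "immature".
pretest_immature = [38, 65, 52, 37]
var_pre, log_var_pre = log_variance_across_writers(pretest_immature)
print(var_pre, log_var_pre)
```

Taking logarithms before the ANOVA reduces the skew that raw variances typically show, which is the usual motivation for analyzing log variances.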



^ \ ; CONCLUSIONS 



The concept of using a computer-based algorithm to analyze prose instructional
materials and to identify high-information words (i.e., those that are rare in American
English) appears to be workable. High-information nouns or adjectives identified as rare
singletons (those occurring only once in a passage) are apparently good candidates for
question words. High-information adjectives identified as keywords (those occurring more
than once in a passage) also appear to be good candidates for question words, provided
they occur only two or three times. In contrast, keyword nouns apparently are not good
candidates, particularly when they occur in general introductory sentences.
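
The distinction between rare singletons and keywords can be sketched as follows. This is a minimal illustration, not the algorithm used in the study: the function name is hypothetical, and `rare_words` stands in for whatever word-frequency norm identifies words that are rare in American English.

```python
import re
from collections import Counter

def classify_question_words(passage, rare_words):
    """Split high-information words into rare singletons and keywords.

    rare_words is a hypothetical set standing in for a word-frequency
    norm; the report's actual frequency source is not reproduced here.
    """
    counts = Counter(re.findall(r"[a-z]+", passage.lower()))
    singletons = sorted(w for w in rare_words if counts[w] == 1)
    keywords = sorted(w for w in rare_words if counts[w] > 1)
    return singletons, keywords

passage = (
    "The most primitive insects, such as the silverfish, do not go "
    "through metamorphosis. These steps are called metamorphosis."
)
rare = {"silverfish", "metamorphosis"}  # assumed high-information words
singletons, keywords = classify_question_words(passage, rare)
print(singletons)  # words occurring exactly once
print(keywords)    # words occurring more than once
```

A part-of-speech check and an upper limit on keyword occurrences (two or three, per the conclusions above) would be natural additions.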



Table 5

Variabilities and Standard Deviations
of Item Difficulties

Item Types                    Pretest     Posttest      Average

Foil Type:
  Writer's Choice    Var.     131.39      101.21        115.31
                     S.D.      11.4[?]     10.06         10.74
  Algorithmic        Var.      69.94      [illegible]    73.97
                     S.D.       8.36        8.85          8.60

Part of Speech:
  Noun               Var.      94.45       85.[?]3       89.93
                     S.D.       9.72        9.25          9.48
  Adjective          Var.      97.30       92.47         94.85
                     S.D.       9.86        9.62          9.74

Stem Type:
  Keyword            Var.     102.[?]9     87.06         94.32
                     S.D.      10.1[?]      9.33          9.71
  Rare Singleton     Var.      89.93       90.95         90.44
                     S.D.       9.48        9.54          9.51
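
Each standard deviation in Table 5 should be the square root of the variability above it. The short check below, offered only as a sketch, applies that relation to three entries of the table (adjective pretest, noun average, keyword posttest); the dictionary labels are assumptions.

```python
import math

# (variance, reported S.D.) pairs taken from Table 5.
reported = {
    "adjective, pretest": (97.30, 9.86),
    "noun, average": (89.93, 9.48),
    "keyword, posttest": (87.06, 9.33),
}

for label, (var, sd) in reported.items():
    computed = math.sqrt(var)
    print(f"{label}: sqrt({var}) = {computed:.2f} (reported {sd})")
```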



RECOMMENDATIONS 

1. Rare singleton nouns and adjectives and keyword adjectives that occur infrequently
in instructional material should be used to select sentences from prose passages
for transformation into questions that measure reading comprehension. Keyword nouns
should not be used, particularly when they occur in general introductory sentences.

2. Methods of algorithmically generating foils for multiple-choice versions of
sentence-derived questions should be further refined and applied in a variety of subject
matter areas.

PROSE PASSAGE USED IN THE EXPERIMENT



4. INSECT DEVELOPMENT



The life of most insects is brief but active. Very
few insects have a life span of more than a year.
By a life span we mean the time from when the
egg is laid to when the fully developed adult dies.
Let's look at what happens during this period.

All insects develop from eggs. In most cases
the eggs hatch outside the body of the female.
In the few cases in which the eggs hatch inside
the female, the young are born "alive." These
insects, such as the aphids, are said to be viviparous
(vi-vip'-ah-rus).

Insects that hatch from eggs after they have
been laid are said to be oviparous (oh-vip'-ah-rus).

[Most] insects are oviparous. In most cases each
egg produces a single immature insect. However,
in certain species of parasitic wasps (encyrtids),
the egg may produce two or more young.

Most insect eggs are very distinctive. The size,
shape, or color of the egg is different, in most
cases, for each species of insect. This enables a
person who has made a study of these eggs to
identify the insect that laid them almost as easily
as if he had seen the adult.

Most insect eggs are laid in a place that will
provide either protection or food for the young.
Protection is especially important to those insects
that overwinter in the egg stage. Overwintering
means that the adult insect lays its eggs in the
late summer or early fall. The eggs then are dormant
until the next spring when they hatch. Most
of the adults of these species are killed by the
first frost. However, the hatching of these eggs in
the spring produces new individuals to carry on
the species.

Most plant-feeding insects instinctively lay their
eggs on plants that the young feed on. This increases
the immature insects' chances of survival.
If this field of investigation interests you, the study
and photography of insect eggs might make a
good project.

After reaching the proper stage of development,
the egg will hatch. The young insect can use a
number of ways to get out of the egg. Some insects
chew their way out. Others have special spinelike
structures, called egg-bursters, which cut through
the shell. There are some eggs which have special
weak spots in them. The young insect escapes
from these either by wriggling or by taking in air
and bursting the shell with internal pressure.

After the Egg

After hatching, all insects, except the most
primitive, go through a series of steps in development.
These steps are called metamorphosis. The
word metamorphosis comes from two Greek
words: meta, meaning to change, and morpho,
meaning form. Therefore, metamorphosis means
a change in form. This change in form occurs in
two different ways. These two ways are called
complete and incomplete metamorphosis. The
most primitive insects, such as the silverfish, do
not go through metamorphosis. When they hatch
they look like their parents in every way except
that they are smaller. Their development consists
of growing larger and becoming able to reproduce.

Incomplete Metamorphosis

Insects which show this type of metamorphosis
have young which look very much like the adults
of the species. These immature insects are called
nymphs. With the exception of some aquatic species,
the principal differences between the nymphs
and adults are in size and the presence of wings
(see illustration at the right).

Now think back to the description of the phylum
to which insects belong, Arthropoda. Remember,
one of the characteristics of these animals is
a hard outer covering called an exoskeleton. The
exoskeleton is made of a nonliving substance
called chitin (ki'-tin). Chitin is hard and stiff and
has very little "stretch." Inside the exoskeleton
there is very little room for growth.

In order to grow, the nymph must escape this
self-made prison. It does this by secreting a new
exoskeleton under the old one. When this new
skin is complete the old skeleton splits down the
back and the insect walks away and leaves it behind.
You have probably seen some of these discarded
skins, called casts, on tree trunks.

Note. Special permission granted by What Insect Is That? published by
Xerox Education Publications, (c) 1965 Xerox Corp.

For a time after the insect discards its old skin,
the new exoskeleton is soft. This allows the exoskeleton
to expand and make room for further
growth.

Each of the periods between molts is called an
instar. Some nymphs go through as many as eight
or more instars before emerging as adults.

Aquatic species that undergo incomplete metamorphosis
must go through one more step in development.
As nymphs they breathe by means of
gills. These gills must be replaced by air-breathing
organs in the adult stage. This is done in the
last nymphal instar. When it is time for the adult
to emerge, the nymph rises to the surface and
molts. The fully developed adult steps out of the
final nymphal skin with fully developed organs
for breathing air.

Complete Metamorphosis 

This is the type of metamorphosis that most
people are familiar with. Butterflies and moths
have complete metamorphosis. There are four
distinct stages: egg, larva, pupa, and adult. Since
the adult's main activity is producing eggs, and
I'm sure you know what these are, we will spend
our time studying the larva and pupa.

The larvae's main job in life is to eat and grow.
They have huge appetites. Larvae are very different
from the adults. They do not have compound
eyes, wings, and usually have chewing mouth
parts even in those orders where the adults have
sucking mouth parts.

A larva may continue to eat and grow all summer.
As cold weather approaches, it may build a
cocoon and pass into the pupal stage.

Most of these insects pass the winter inside the
cocoon. Because no activity is visible at this time,
the pupa has been falsely called a "resting stage."
Actually a great deal of activity is going on. The
wormlike larva is changing into a fully developed
adult. When the weather is warm again, this adult
emerges from the cocoon, mates, lays eggs, and
starts the whole process over again.

Let's Get Together

[Most] insects reproduce sexually. This means
that to have eggs that will hatch, a male and a
female of the species must mate. The question is:
How do they find each other?

It has been known for years that some of the
sounds made by crickets and cicadas were a type
of mating call. It is easy to see these insects
get together. But what about the insects that do
not make noise; butterflies, for instance?

It has been discovered that the females of these
species give off a distinctive odor. This odor is
detectable by male insects over great distances.
The male follows this scent trail back to the
female.

This brings to mind an interesting experiment
you might try. A friend of mine once caught a recently
emerged female Promethea moth. He put
the female in a screen cage and set it outside his
window. In less than two hours there were more
than twenty males hanging on the outside of the
cage. Why don't you try this with other kinds of
insects? It would make a great science project.

Science has used the discovery of these odors to
help eliminate undesirable insects. It was found
that female cockroaches gave off an attractive (to
male cockroaches) odor. Scientists have been able
to reproduce this scent and have used it to attract
males to traps.

Exercises
How Well Did You Read?

1. Name and describe the three types of development
insects can go through.

2. What advantage is there in insect eggs being laid on
certain plants?

3. What is metamorphosis? What are the differences
between complete and incomplete metamorphosis?

4. What processes take place during the growth of
insects?

5. Can you think of any advantages to some insects in
being born "alive"?

Read A Little More

1. Lemmon, R. S. All About Moths and Butterflies.
New York: Random House, 1956.






EXAMPLES OF ITEMS PRODUCED FROM TEXT

Keyword Noun -- Metamorphosis

a. Text Sentence(s): After hatching, all insects, except the most primitive,
                     go through a series of steps in development. These
                     steps are called metamorphosis.

b. Items (Stem and Foils) Produced by Item Writers:

   (1) What are the series of steps in insect development called?

       (a) Maturation             (c) Symbiosis
       (b) Metamorphosis          (d) Meitosis

   (2) What are the steps insects go through in development called?

       (a) Metamorphosis          (c) Larva
       (b) Arthropoda             (d) Pupa

   (3) What are a series of steps in development called?

       (a) Reproduction           (c) Metamorphosis
       (b) Larvae                 (d) Changes

   (4) What are the series of steps in insect development called?

       (a) Encyrtid               (c) Arthorpoda
       (b) Instar                 (d) Metamorphosis

c. Foils Produced Algorithmically:

       Growths
       Metamorphosis
       Types
       Activities

Rare Singleton Noun -- Silverfish

a. Text Sentence: The most primitive insects, such as the silverfish, do
                  not go through metamorphosis.

b. Items (Stem and Foils) Produced by Item Writers:

   (1) What does not go through metamorphosis? The

       (a) Moth                   (c) Nymphs
       (b) Silverfish             (d) Butterfly

   (2) What do not go through metamorphosis? The most primitive insects,
       such as

       (a) Silverfish             (c) Spiders
       (b) Termites               (d) Moths

   (3) What insects do not go through metamorphosis? The primitive, such as

       (a) Eggs                   (c) Chitin
       (b) Silverfish             (d) Butterflies

   (4) The most primitive insects, such as what, do not go through metamorphosis?

       (a) Butterflies            (c) Canines
       (b) Silverfish             (d) Cicadas

c. Foils Produced Algorithmically:

       Silverfish
       Individuals
       Wasps

Keyword Adjective -- Immature

a. Text Sentence: In most cases, each egg produces a single immature insect.

b. Items (Stem and Foils) Produced by Item Writers:

   (1) What does each egg produce in most cases? A single

       (a) Immature insect        (c) Adolescent insect
       (b) Adult insect           (d) Mature insect

   (2) What does each egg produce in most cases? A single

       (a) Oviparous insect       (c) Mature insect
       (b) Nymphal insect         (d) Immature insect

   (3) In most cases, what does each egg produce? A single

       (a) Dormant insect         (c) Adult insect
       (b) Adult insect           (d) Immature insect

   (4) What does each egg produce? A single

       (a) Immature insect        (c) Round insect
       (b) Mature insect          (d) Adult insect

c. Foils Produced Algorithmically:

       Complete insect
       Distinct insect
       Immature insect
       Incomplete insect

Rare Singleton Adjective -- Pupal

a. Text Sentence(s): A larva may continue to eat and grow all summer. As
                     cold weather approaches, it may build a cocoon and
                     pass into the pupal stage.

b. Items (Stem and Foils) Produced by Item Writers:

   (1) What may a larva do as the cold weather approaches? Build a cocoon
       and pass into the

       (a) Nymphal stage          (c) Pupal stage
       (b) Parasitic stage        (d) Molt stage

   (2) As cold weather approaches, a larva may build a cocoon and pass
       into what?

       (a) Infant stage           (c) Butterfly stage
       (b) Adult stage            (d) Pupal stage

   (3) Into what stage may the larva pass as cold weather approaches and
       it builds a cocoon? The

       (a) Larval stage           (c) Skeletal stage
       (b) Pupal stage            (d) Nymphal stage

   (4) As cold weather approaches, what may a larva do? Build a cocoon
       and pass into the

       (a) Pupal stage            (c) Dormant stage
       (b) Hibernation stage      (d) Resting stage

c. Foils Produced Algorithmically:

       Pupal stage
       Nymphal stage
       Parasitic stage
       Insect stage
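
To show how a stem and a list of alternatives might be laid out as a four-option item, the sketch below uses the algorithmically produced alternatives listed above for the rare singleton adjective "pupal." It does not reproduce the foil-generation algorithm itself, which is not described in this appendix; the function name, the seeded shuffle, and the lettering scheme are all assumptions made for illustration.

```python
import random

def build_item(stem, key, foils, seed=0):
    """Lay out a four-option item; returns (text, letter of the key)."""
    # Put the key first, drop any duplicate of it from the foil list.
    options = [key] + [f for f in foils if f != key]
    random.Random(seed).shuffle(options)  # seeded for a repeatable order
    letters = "abcd"
    lines = [stem] + [f"  ({l}) {o}" for l, o in zip(letters, options)]
    return "\n".join(lines), letters[options.index(key)]

stem = ("What may a larva do as the cold weather approaches? "
        "Build a cocoon and pass into the")
foils = ["Pupal stage", "Nymphal stage", "Parasitic stage", "Insect stage"]
text, answer = build_item(stem, "Pupal stage", foils)
print(text)
print("key:", answer)
```

Shuffling the options avoids always placing the correct answer in the same position.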


